Chantilly
Combining Distantly Supervised Models with In Context Learning for Monolingual and Cross-Lingual Relation Extraction
Rathore, Vipul, Faisal, Malik Hammad, Singla, Parag, Mausam
Distantly Supervised Relation Extraction (DSRE) remains a long-standing challenge in NLP, where models must learn from noisy bag-level annotations while making sentence-level predictions. While existing state-of-the-art (SoTA) DSRE models rely on task-specific training, their integration with in-context learning (ICL) using large language models (LLMs) remains underexplored. A key challenge is that the LLM may not learn relation semantics correctly because of the noisy annotations. In response, we propose HYDRE -- HYbrid Distantly Supervised Relation Extraction framework. It first uses a trained DSRE model to identify the top-k candidate relations for a given test sentence, then uses a novel dynamic exemplar retrieval strategy that extracts reliable, sentence-level exemplars from the training data, which are then provided in the LLM prompt for outputting the final relation(s). We further extend HYDRE to cross-lingual settings for RE in low-resource languages. Using available English DSRE training data, we evaluate all methods on English as well as a newly curated benchmark covering four diverse low-resource Indic languages -- Oriya, Santali, Manipuri, and Tulu. HYDRE achieves gains of up to 20 F1 points in English and, on average, 17 F1 points on Indic languages over prior SoTA DSRE models. Detailed ablations demonstrate HYDRE's efficacy compared to other prompting strategies.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Africa > Liberia (0.14)
- North America > United States > Florida > Miami-Dade County > Miami (0.04)
Readability $\ne$ Learnability: Rethinking the Role of Simplicity in Training Small Language Models
Lee, Ivan, Berg-Kirkpatrick, Taylor
Recent studies suggest that very small language models (SLMs) can generate surprisingly coherent text when trained on simplified, child-directed corpora such as TinyStories. These findings have been interpreted as evidence that readability -- characterized by accessible vocabulary, familiar narrative structure, and simple syntax -- plays a key role in enabling such capabilities to emerge. In this paper, we challenge that interpretation. We construct synthetic datasets with matched structure but varied readability, and find that readability alone does not predict coherence or learning efficiency in SLMs. Models trained on complex, adult-level text perform comparably to those trained on simplified language, and even exhibit faster development of coherence during training. Instead, we show that statistical simplicity, as measured by n-gram diversity, is a stronger predictor of learnability. Our findings caution against the growing trend of anthropomorphizing language model training -- drawing parallels to human cognitive development without empirical basis -- and argue for more precise reasoning about what properties actually support capability emergence in small models.
- North America > United States > Florida > Miami-Dade County > Miami (0.14)
- Asia > Russia (0.14)
- North America > United States > New Hampshire (0.04)
- Media (1.00)
- Leisure & Entertainment > Sports (1.00)
- Law (1.00)
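The "Readability $\ne$ Learnability" abstract above names n-gram diversity as its measure of statistical simplicity but does not define it. One common reading, assumed here and not confirmed by the source, is the ratio of distinct n-grams to total n-grams in a token sequence. The function name `ngram_diversity` and the whitespace tokenization are illustrative choices only.

```python
def ngram_diversity(tokens, n=2):
    """Ratio of distinct n-grams to total n-grams.

    Assumed definition (a type-token ratio over n-grams); the abstract
    does not specify the exact formula the authors use.
    """
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not ngrams:
        return 0.0  # sequence shorter than n: no n-grams to count
    return len(set(ngrams)) / len(ngrams)


# Hypothetical toy inputs, tokenized by whitespace for illustration.
repetitive = "the cat sat the cat sat the cat sat".split()
varied = "a quick brown fox jumps over the lazy dog".split()

print(ngram_diversity(repetitive))  # lower value: the same bigrams recur
print(ngram_diversity(varied))      # higher value: every bigram is distinct
```

Under this reading, a lower score means more repeated local patterns, which is the sense of "statistically simpler" text used in the abstract.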
Incorporating Stochastic Models of Controller Behavior into Kinodynamic Efficiently Adaptive State Lattices for Mobile Robot Motion Planning in Off-Road Environments
Damm, Eric R., Lancaster, Eli S., Sanchez, Felix A., Bronder, Kiana, Gregory, Jason M., Howard, Thomas M.
Mobile robot motion planners rely on theoretical models to predict how the robot will move through the world. However, when deployed on a physical robot, these models are subject to errors due to real-world physics and uncertainty in how the lower-level controller follows the planned trajectory. In this work, we address this problem by presenting three methods of incorporating stochastic controller behavior into the recombinant search space of the Kinodynamic Efficiently Adaptive State Lattice (KEASL) planner. To demonstrate this work, we analyze the results of experiments performed on a Clearpath Robotics Warthog Unmanned Ground Vehicle (UGV) in an off-road, unstructured environment using two different perception algorithms, and perform an ablation study using a full spectrum of simulated environment map complexities. Analysis of the data found that incorporating stochastic controller sampling into KEASL leads to more conservative trajectories that decrease predicted collision likelihood when compared to KEASL without sampling. When compared to baseline planning with expanded obstacle footprints, the predicted likelihood of collisions becomes more comparable, but expanding the footprints reduces the planning success rate for baseline search.
- North America > United States > Virginia > Fairfax County > Chantilly (0.04)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- North America > United States > New York > Monroe County > Rochester (0.04)
- North America > United States > Maryland > Prince George's County > Adelphi (0.04)
- Government (0.69)
- Transportation > Ground > Road (0.41)
The Perceived Danger (PD) Scale: Development and Validation
Molan, Jaclyn, Saad, Laura, Roesler, Eileen, McCurry, J. Malcolm, Gyory, Nathaniel, Trafton, J. Gregory
There are currently no psychometrically valid tools to measure the perceived danger of robots. To fill this gap, we provided a definition of perceived danger and developed and validated a 12-item bifactor scale through four studies. An exploratory factor analysis revealed four subdimensions of perceived danger: affective states, physical vulnerability, ominousness, and cognitive readiness. A confirmatory factor analysis confirmed the bifactor model. We then compared the perceived danger scale to the Godspeed perceived safety scale and found that the perceived danger scale is a better predictor of empirical data. We also validated the scale in an in-person setting and found that the perceived danger scale is sensitive to robot speed manipulations, consistent with previous empirical findings. Results across experiments suggest that the perceived danger scale is reliable, valid, and an adequate predictor of both perceived safety and perceived danger in human-robot interaction contexts.
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- North America > United States > District of Columbia > Washington (0.04)
- North America > United States > Virginia > Fairfax County > Fairfax (0.04)
- North America > United States > Virginia > Fairfax County > Chantilly (0.04)
- Research Report > Experimental Study (1.00)
- Questionnaire & Opinion Survey (1.00)
- Health & Medicine (1.00)
- Government > Military (1.00)
SteLLA: A Structured Grading System Using LLMs with RAG
Qiu, Hefei, White, Brian, Ding, Ashley, Costa, Reinaldo, Hachem, Ali, Ding, Wei, Chen, Ping
Large Language Models (LLMs) have shown strong general capabilities in many applications. However, making them reliable tools for specific tasks such as automated short answer grading (ASAG) remains a challenge. We present SteLLA (Structured Grading System Using LLMs with RAG), in which a) a Retrieval Augmented Generation (RAG) approach is used to empower LLMs specifically on the ASAG task by extracting structured information from highly relevant and reliable external knowledge based on the instructor-provided reference answer and rubric, and b) an LLM performs a structured, question-answering-based evaluation of student answers to provide analytical grades and feedback. A real-world dataset containing students' answers in an exam was collected from a college-level Biology course. Experiments show that our proposed system can achieve substantial agreement with the human grader while providing break-down grades and feedback on all the knowledge points examined in the problem. A qualitative and error analysis of the feedback generated by GPT4 shows that GPT4 is good at capturing facts but may be prone to inferring too much from the given text in the grading task, which provides insights into the usage of LLMs in ASAG systems.
- North America > United States > Washington > King County > Seattle (0.14)
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > United States > Massachusetts > Suffolk County > Boston (0.14)
Comparative Analysis of LSTM, GRU, and Transformer Models for Stock Price Prediction
Xiao, Jue, Deng, Tingting, Bi, Shuochen
In today's fast-paced financial markets, investors constantly seek ways to gain an edge and make informed decisions. Although achieving perfect accuracy in stock price predictions remains elusive, artificial intelligence (AI) advancements have significantly enhanced our ability to analyze historical data and identify potential trends. This paper takes AI-driven stock price trend prediction as its core research topic, builds a model training dataset from Tesla stock data covering 2015 to 2024, and compares LSTM, GRU, and Transformer models. The analysis is consistent with the stock trend prediction setting, and the experimental results show that the accuracy of the LSTM model is 94%. These methods ultimately allow investors to make more informed decisions and gain clearer insight into market behavior.
- North America > United States > Virginia > Fairfax County > Chantilly (0.04)
- North America > United States > New Jersey > Hudson County > Jersey City (0.04)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- Banking & Finance > Trading (1.00)
- Transportation > Ground > Road (0.50)
Data Processing for the OpenGPT-X Model Family
Brandizzi, Nicolo', Abdelwahab, Hammam, Bhowmick, Anirban, Helmer, Lennard, Stein, Benny Jörg, Denisov, Pavel, Saleem, Qasid, Fromm, Michael, Ali, Mehdi, Rutmann, Richard, Naderi, Farzad, Agy, Mohamad Saif, Schwirjow, Alexander, Küch, Fabian, Hahn, Luzian, Ostendorff, Malte, Suarez, Pedro Ortiz, Rehm, Georg, Wegener, Dennis, Flores-Herr, Nicolas, Köhler, Joachim, Leveling, Johannes
This paper presents a comprehensive overview of the data preparation pipeline developed for the OpenGPT-X project, a large-scale initiative aimed at creating open and high-performance multilingual large language models (LLMs). The project goal is to deliver models that cover all major European languages, with a particular focus on real-world applications within the European Union. We explain all data processing steps, starting with the data selection and requirement definition to the preparation of the final datasets for model training. We distinguish between curated data and web data, as each of these categories is handled by distinct pipelines, with curated data undergoing minimal filtering and web data requiring extensive filtering and deduplication. This distinction guided the development of specialized algorithmic solutions for both pipelines. In addition to describing the processing methodologies, we provide an in-depth analysis of the datasets, increasing transparency and alignment with European data regulations. Finally, we share key insights and challenges faced during the project, offering recommendations for future endeavors in large-scale multilingual data preparation for LLMs.
- Europe > Germany (0.14)
- Asia > Middle East > Jordan (0.04)
- South America > Paraguay > Asunción > Asunción (0.04)
- Overview (0.86)
- Research Report > New Finding (0.46)
- Law (1.00)
- Government (1.00)
- Information Technology > Security & Privacy (0.93)
- Information Technology > Software (0.71)
Unsupervised Speaker Diarization in Distributed IoT Networks Using Federated Learning
Bhuyan, Amit Kumar, Dutta, Hrishikesh, Biswas, Subir
This paper presents a computationally efficient and distributed speaker diarization framework for networked IoT-style audio devices. The work proposes a Federated Learning model which can identify the participants in a conversation without requiring a large audio database for training. An unsupervised online update mechanism is proposed for the Federated Learning model, which depends on the cosine similarity of speaker embeddings. Moreover, the proposed diarization system solves the problem of speaker change detection via unsupervised segmentation techniques using Hotelling's t-squared statistic and the Bayesian Information Criterion. In this new approach, speaker change detection is biased around detected quasi-silences, which reduces the severity of the trade-off between the missed detection and false detection rates. Additionally, the computational overhead due to frame-by-frame identification of speakers is reduced via unsupervised clustering of speech segments. The results demonstrate the effectiveness of the proposed training method in the presence of non-IID speech data. They also show a considerable reduction in false and missed detections at the segmentation stage, along with lower computational overhead. Improved accuracy and reduced computational cost make the mechanism suitable for real-time speaker diarization across a distributed IoT audio network.
- North America > United States > Virginia > Fairfax County > Chantilly (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Michigan (0.04)
- Media (0.46)
- Information Technology (0.46)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.49)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.48)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.35)
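The speaker diarization abstract above describes an unsupervised online update driven by the cosine similarity of speaker embeddings. The sketch below is not the paper's federated mechanism. It only illustrates the general idea under stated assumptions: an embedding is assigned to the most similar known speaker if the similarity clears a threshold, and otherwise starts a new speaker. The names `assign_speaker`, `centroids`, and `counts`, the running-mean update, and the threshold of 0.8 are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def assign_speaker(embedding, centroids, counts, threshold=0.8):
    """Assign an embedding to a speaker index, updating state in place.

    Assumed rule: match the nearest centroid if similarity >= threshold,
    then move that centroid to the running mean; otherwise add a speaker.
    """
    best_idx, best_sim = -1, -1.0
    for idx, centroid in enumerate(centroids):
        sim = cosine_similarity(embedding, centroid)
        if sim > best_sim:
            best_idx, best_sim = idx, sim
    if best_idx >= 0 and best_sim >= threshold:
        n = counts[best_idx]
        centroids[best_idx] = [
            (c * n + e) / (n + 1) for c, e in zip(centroids[best_idx], embedding)
        ]
        counts[best_idx] = n + 1
        return best_idx
    centroids.append(list(embedding))
    counts.append(1)
    return len(centroids) - 1


# Hypothetical 2-D embeddings; real speaker embeddings have many more dimensions.
centroids, counts = [], []
labels = [assign_speaker(e, centroids, counts) for e in ([1.0, 0.0], [0.9, 0.1], [0.0, 1.0])]
print(labels)
```

Because no labeled audio is needed, a rule of this kind can run on each device as segments arrive; how the per-device state is aggregated across the network is outside what this sketch covers.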
Standardizing Knowledge Engineering Practices with a Reference Architecture
Allen, Bradley P., Ilievski, Filip
Knowledge engineering is the process of creating and maintaining knowledge-producing systems. Throughout the history of computer science and AI, knowledge engineering workflows have been widely used given the importance of high-quality knowledge for reliable intelligent agents. Meanwhile, the scope of knowledge engineering, as apparent from its target tasks and use cases, has been shifting, together with its paradigms such as expert systems, semantic web, and language modeling. The intended use cases and supported user requirements between these paradigms have not been analyzed globally, as new paradigms often satisfy prior pain points while possibly introducing new ones. The recent abstraction of systemic patterns into a boxology provides an opening for aligning the requirements and use cases of knowledge engineering with the systems, components, and software that can satisfy them best. This paper proposes a vision of harmonizing the best practices in the field of knowledge engineering by leveraging the software engineering methodology of creating reference architectures. We describe how a reference architecture can be iteratively designed and implemented to associate user needs with recurring systemic patterns, building on top of existing knowledge engineering workflows and boxologies. We provide a six-step roadmap that can enable the development of such an architecture, providing an initial design and outcome of the definition of architectural scope, selection of information sources, and analysis. We expect that following through on this vision will lead to well-grounded reference architectures for knowledge engineering, will advance the ongoing initiatives of organizing the neurosymbolic knowledge engineering space, and will build new links to the software architectures and data science communities.
- South America > Paraguay > Asunción > Asunción (0.04)
- Europe > Netherlands > North Holland > Amsterdam (0.04)
- South America > Uruguay > Maldonado > Maldonado (0.04)
NASA space shuttle installed at site of future Los Angeles science museum
Former ISS commander Terry Virts joined 'Fox & Friends' to discuss the significance of the mission as the American rocket heads to the moon for the first time in 50 years. NASA's retired Space Shuttle Endeavour was carefully hoisted late Monday to be mated to a huge external fuel tank and its two solid rocket boosters at a Los Angeles museum where it will be uniquely displayed as if it is about to blast off. A massive crane delicately began lifting the orbiter, which is 122 feet long and has a 78-foot wingspan, into the partially built Samuel Oschin Air and Space Center at the California Science Center in Exposition Park. The building will be completed around Endeavour before the display opens to the public. The 20-story-tall display stands atop an 1,800-ton concrete slab supported by six so-called base isolators to protect Endeavour from earthquakes.
- North America > United States > California > Los Angeles County > Los Angeles (0.67)
- North America > United States > Virginia > Fairfax County > Chantilly (0.06)
- North America > United States > New York (0.06)
- North America > United States > Florida > Brevard County (0.06)
- Government > Space Agency (1.00)
- Government > Regional Government > North America Government > United States Government (1.00)